Comparison of site-specific rate-inference methods for protein sequences: empirical Bayesian methods are superior.

Authors

  • Itay Mayrose
  • Dan Graur
  • Nir Ben-Tal
  • Tal Pupko
Abstract

The degree to which an amino acid site is free to vary is strongly dependent on its structural and functional importance. An amino acid that plays an essential role is unlikely to change over evolutionary time. Hence, the evolutionary rate at an amino acid site is indicative of how conserved this site is and, in turn, allows evaluation of its importance in maintaining the structure/function of the protein. When using probabilistic methods for site-specific rate inference, few alternatives are possible. In this study we use simulations to compare the maximum-likelihood and Bayesian paradigms. We study the dependence of inference accuracy on such parameters as number of sequences, branch lengths, the shape of the rate distribution, and sequence length. We also study the possibility of simultaneously estimating branch lengths and site-specific rates. Our results show that a Bayesian approach is superior to maximum-likelihood under a wide range of conditions, indicating that the prior that is incorporated into the Bayesian computation significantly improves performance. We show that when branch lengths are unknown, it is better first to estimate branch lengths and then to estimate site-specific rates. This procedure was found to be superior to estimating both the branch lengths and site-specific rates simultaneously. Finally, we illustrate the difference between maximum-likelihood and Bayesian methods when analyzing site-conservation for the apoptosis regulator protein Bcl-x(L).
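
The contrast between the two paradigms can be illustrated with a deliberately simplified toy model. This is not the phylogenetic likelihood computation used in the study. The sketch assumes that the number of substitutions observed at a site is Poisson-distributed with mean equal to the site rate times the total tree length, and that rates follow a gamma prior with mean 1. The function names `ml_rate` and `bayes_rate`, the parameter `alpha`, and the example numbers are all hypothetical.

```python
def ml_rate(k, tree_length):
    """Maximum-likelihood rate, assuming k ~ Poisson(rate * tree_length)."""
    return k / tree_length


def bayes_rate(k, tree_length, alpha):
    """Posterior mean rate under a Gamma(shape=alpha, rate=alpha) prior.

    The prior has mean 1. By Poisson-gamma conjugacy the posterior is
    Gamma(alpha + k, alpha + tree_length), whose mean is returned here.
    """
    return (k + alpha) / (tree_length + alpha)


if __name__ == "__main__":
    # Hypothetical substitution counts on a tree of total length 2.0.
    for k in (0, 1, 5):
        print(k, ml_rate(k, 2.0), bayes_rate(k, 2.0, alpha=1.0))
```

In this toy setting, a site with no observed substitutions gets a maximum-likelihood rate of exactly zero, whereas the posterior mean is pulled toward the prior mean. That shrinkage is one intuition for why a prior might help when little data is available per site.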


Similar resources

A Bayesian model comparison approach to inferring positive selection.

A popular approach to detecting positive selection is to estimate the parameters of a probabilistic model of codon evolution and perform inference based on its maximum likelihood parameter values. This approach has been evaluated intensively in a number of simulation studies and found to be robust when the available data set is large. However, uncertainties in the estimated parameter values can...


Inferring Site-Specific Evolutionary Rates: Bayesian Methods are Superior

Not all sites in protein sequences evolve at the same rate during the course of evolution. The working hypothesis assumes that evolutionarily conserved sites in a protein sequence point to functionally important regions. Thus, identifying conserved regions in a protein impacts such areas as functional annotation studies, drug design, quaternary structure prediction, and protein biochemistry [1]. ...


Bayesian approach to inference of population structure

Methods of inferring the population structure, its applications in identifying disease models as well as foresighting the physical and mental situation of human beings have been finding ever-increasing importance. In this article, first, motivation and significance of studying the problem of population structure is explained. In the next section, the applications of inference of p...


Modified signed log-likelihood test for the coefficient of variation of an inverse Gaussian population

In this paper, we consider the problem of two-sided hypothesis testing for the coefficient of variation of an inverse Gaussian population. The approach used here is the modified signed log-likelihood ratio (MSLR) method, which is a modification of the traditional signed log-likelihood ratio test. Previous works show that this proposed method has third-order accuracy whereas the traditi...



Journal:
  • Molecular Biology and Evolution

Volume 21, Issue 9

Pages: -

Publication date: 2004